Basic control structures and variables

Nails for the hammer

As one of the prereqs for this course was some knowledge of MATLAB or a decent understanding of programming, we won't spend a huge amount of time on concepts and will focus on the Python context instead. If you do find yourself stuck on a fundamental question, ask others around you, and do help others around you, especially if I'm on the far side of the room. I'll not feel left out.

These first few tasks are all pretty unapplied and guided - this should get us underway, and we will then start learning Python by using actual science applications. Since we only have two part-days, and you already have used some coding, I will move through this introductory session fairly quickly, so you have the tools to explore a wide variety of Python aspects by the end of Day 2 - please discuss with each other and if you are still wondering about things during the break, do ask.

My recommendation is that the first time I run a cell, you do so too - I will show you how in a mo. This means everything I define and later use, you will have defined and can later use too.

First experiment with Jupyter

  • Find which line number you are in Etherpad
  • Click the blank box below saying In [ ]
  • Change it to read x = LINENUM (e.g. x = 15)
  • Press Ctrl+Return

In [54]:
x = # Insert something before the hash


  File "<ipython-input-54-e41ad21a45b7>", line 1
    x = # Insert something here
                               ^
SyntaxError: invalid syntax

Note that # is the comment symbol in Python - everything after it is ignored.

This just set the variable x (globally for this notebook) to 1

Click to edit the next In [ ] and press Ctrl+Return (without changing anything this time)


In [2]:
2 * x


Out[2]:
2

Hopefully, you should notice that x is indeed what you input. This should all be pretty familiar, unless you come from statically-typed languages (C/C++/Java) where you would have to declare a variable. Python lets you effectively declare it, and its type, by assigning to it.

We can refer to this output as Out[2], using it like any variable, without having to re-run everything so far.

Do the same trick in the next In [ ] (edit and Ctrl+Return)


In [3]:
if Out[2] / 2 == x:
    print("Thankfully, maths still works")


Thankfully, maths still works

There are a couple of things here to dissect. Many languages need braces {}, brackets [] or parentheses () to draw boundaries around the argument of the if statement, or the body. In Python, symbols are reduced in favour of words and spaces, for readability. To achieve the same effect, Python separates the condition from the body with the colon : and the indentation (four spaces, by convention)

Indentation is Python's Marmite for other programmers. As they are keen to point out, most whitespace does not matter: e.g. I could write

Out[2] / 2 == x :

and it's the same as

Out[2] / 2 == x:

However, indentation (whitespace at the start of a line) does. It is an extremely visually clear way to show where blocks of code, like the body of this if statement, start and end. For scientists, it forces a very good programming practice - basic code style - because your code doesn't run if it isn't visually clear (well... neatly indented). One of the reasons I got into Python was that my PhD C++ code was so unreadable, dodgily indented and unwieldy that it was easier to rewrite in Python than to try and extend it. Once I had, modification was easy: I could see how everything was laid out, where functions started and ended, before even reading a character.

Another thing to note is that you can print with print. Here we are running Python3, the latest version. If you've played with Python2, you will notice this syntax is a bit different - it has parentheses (aka. parens) around the argument. A whole raft of changes came in with Python3 and, like XP, Python2 will be around for some time to come through sheer momentum. However by 2016, almost all important libraries are now Python3-ready and I, for one, develop new software in Python3 only, where possible. Good porting tools exist for moving old code.

To summarize those key points:

  • Indentation separates bodies of functions, conditionals, loops from the outside code
  • Basic comparison and boolean operators are ==, >, <, or, and, not, etc.
  • This is Python3 syntax (very slightly different to Python2)
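To make the operators in that list concrete, here is a minimal sketch. The variable line_number is a made-up stand-in for whatever you typed into x earlier, so the results only hold for this particular value.

```python
# Hypothetical value, standing in for your own Etherpad line number.
line_number = 15

in_range = (line_number > 5) and (line_number < 100)      # both must hold
out_of_range = (line_number < 5) or (line_number > 100)   # either may hold
not_fifteen = not (line_number == 15)                     # negation

# Comparisons can also be chained, much as you would write them in maths.
chained = 5 < line_number < 100

print(in_range, out_of_range, not_fifteen, chained)
```

The parentheses are optional here, but they cost nothing and make the grouping obvious.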

Debugging

Ladder for the hole

Now we can edit and run code, the inevitable consequence is that we get a bug!

Try running the line below (without correcting it):


In [5]:
if (x < 5) or (x > 100):
    print('Do I know you?')


Do I know you?

Lots of pretty colours! What do they mean?

In general, the most immediately useful information is on the very last line: SyntaxError. Above it, is the last "frame" called - that is, level of function. Say you call a function, and it calls another, and so forth until the innermost function hits an error. Here, it will list all the functions in that sequence, so you can track exactly how you got to the dodgy line.

In Jupyter, the filename is not very informative - but you can see the input index (ipython-input-4 ↔ In [4]).

Bonus mark if you spotted the error (not that we actually have marks, but hey) - have a look and see if you can fix it and re-run the command, the same way you ran it the first time. Feel free to discuss, use Etherpad chat, whichever. Note that the In [4] changes to In [5], and so forth, each time you run that "cell".

In general, "a string" and 'a string' are equivalent - Python does not mind whether single or double quotes are used, as long as they match at either end.

[PTW: keep slide on screen to view syntax error]

  • Check the error at the bottom
  • Check the caret (^) showing where Python sees a problem
  • Look at the syntax highlighting in the cell

Functions

Painting the Black Box

Like every language, Python has functions... they are defined like so:


In [52]:
def tell_me_if_im_old(name, age):
    print("Is", name, "old?")
    if age < 30:
        return "No"
    else:
        return "No"

Couple things to note: (i) this has nested blocks, we just indented twice for the inner ones (if and else bodies). (ii) this didn't specify a type for age (or name) - all it cares is that we can compare age to an int. Similarly, we don't specify a return type - it is up to us to return something intuitive and sensible. Let's use it:


In [53]:
phils_age = 103
tell_me_if_im_old("Phil", phils_age)


Is Phil old?
Out[53]:
'No'

This one I'm saving for future use. You can see that this is used pretty much like every function in many languages - name, parens containing arguments.
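As a rough illustration of the point about not specifying types, the sketch below uses a cut-down helper of my own invention (is_under_thirty is hypothetical, not part of the course code). All it assumes is that its argument can be compared to an int.

```python
def is_under_thirty(age):
    # No type is declared for age: anything comparable to an int will do.
    return age < 30


from_int = is_under_thirty(29)        # an int
from_float = is_under_thirty(29.5)    # a float works just as well
from_big_int = is_under_thirty(103)   # and the answer can of course be False

print(from_int, from_float, from_big_int)
```

Passing something that cannot be compared to an int (a string, say) only fails when the comparison is actually attempted, not when the function is defined.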

There are other types of function - those on objects. Rather than cramming an introduction to object-oriented programming (OOP) now, especially as a number of you have done Computer Science, I have added an Appendix to these notes about my non-existent dogs, "Freddie" and "Nitwit". If OOP confuses you, we can use a bit of the break, if your brain isn't fried, to go through those.

For the moment though, what matters is that you can add a dot to most things in Python and access a number of properties of that variable or things it can do. A couple of examples to make it clearer:


In [9]:
"just a normal string".upper()


Out[9]:
'JUST A NORMAL STRING'

In [10]:
(3 + 2j).imag


Out[10]:
2.0

In [11]:
"Another string".islower()


Out[11]:
False

Lists and dicts

Conveyor belts and storage units

As in any language, we often need to be able to group things in Python.

We can create a list, which is like a one-dimensional C or FORTRAN array, as follows (don't forget to execute it!):


In [14]:
things_to_do = ['Learn Python', 'Finish PhD', 'Publish research',
                'Accept Nobel prize', 'Inspire a new generation']

The syntax here is the same as for the more basic types we saw so far, variable = value, and our list is a comma-separated series of things, bounded by brackets [].

Once you execute the cell above, you should be able to try this:


In [15]:
sorted(things_to_do)


Out[15]:
['Accept Nobel prize',
 'Finish PhD',
 'Inspire a new generation',
 'Learn Python',
 'Publish research']

...and get an alphabetical list of tasks. Note that sorted doesn't update the original object, so if you re-run without sorted, it's back to the previous order.

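To update the list itself, you can use a slightly different technique (the list's own method, things_to_do.sort()) or just assign: things_to_do = sorted(things_to_do).

sorted guesses the ordering you want, based on the type of data. It doesn't all have to be strings: a list of numbers sorts numerically. Note, though, that Python3 refuses to compare strings with integers, so a mixture of the two raises a TypeError unless you tell sorted how to compare them. sorted can take optional arguments (key and reverse), letting you force it to do things the way you want, or even provide your own ordering function.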
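Here is a small sketch of those optional arguments in action. The list tasks is a shortened, made-up copy of the to-do list, used so the example stands on its own.

```python
tasks = ['Learn Python', 'Finish PhD', 'Publish research']

# reverse=True flips the default (alphabetical) order.
backwards = sorted(tasks, reverse=True)

# key= takes a function that is applied to each item before comparing;
# here the built-in len, so the shortest task comes first.
by_length = sorted(tasks, key=len)

# sorted returned new lists, so the original is still in its first order.
still_original = (tasks[0] == 'Learn Python')

# The list's own sort method, by contrast, reorders it in place.
tasks.sort()

print(backwards)
print(by_length)
print(still_original, tasks)
```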

So how do we get things out of a list?

Lists are ordered, so you can say, "I'd like the fifth item, please". You put it in brackets after the list name:


In [16]:
things_to_do[4]


Out[16]:
'Inspire a new generation'

However, note that...

PYTHON IS ZERO-INDEXED

Like C, but unlike MATLAB or FORTRAN. The first item in a list is "things_to_do[0]"

By the way, did you notice that In[] and Out[] look like lists? In really is one (Out, it turns out, is a dict - more on those shortly):


In [17]:
type(In)


Out[17]:
list

You can try len as well!

Slicing

Sounds cool, is cool

Suppose I want a bit of a list, like the middle 3 items, or the first 2, or last 2...


In [18]:
things_to_do[1:4]


Out[18]:
['Finish PhD', 'Publish research', 'Accept Nobel prize']

...if you wonder why things_to_do[4] (the fifth item) didn't appear, note that that number after the colon means 'up-to-but-not-including'...


In [19]:
things_to_do[:2]


Out[19]:
['Learn Python', 'Finish PhD']

In [20]:
things_to_do[3:]


Out[20]:
['Accept Nobel prize', 'Inspire a new generation']

We can use negative numbers to count back from the end instead...


In [ ]:
things_to_do[-2:]

Which is "start two from the end, and give me everything from there onwards"
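A short sketch of negative indices, using a standalone copy of the original five-item list (the name tasks is just for this example):

```python
tasks = ['Learn Python', 'Finish PhD', 'Publish research',
         'Accept Nobel prize', 'Inspire a new generation']

last_item = tasks[-1]        # a single item, counted from the end
last_two = tasks[-2:]        # from two-before-the-end onwards
all_but_last = tasks[:-1]    # up to, but not including, the last item

# A negative index is shorthand for "length minus that many".
same_as_last_two = tasks[len(tasks) - 2:]

print(last_item)
print(last_two)
print(len(all_but_last), last_two == same_as_last_two)
```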

Dictionaries

This is fine - a list is like a conveyer belt, with a load of arbitrarily filled boxes passing by, one after the other. However, sometimes we care which item is which, and want to have a name for it, so we can drop in and grab it wherever it may be. That's where dictionaries (= dicts) come in. Try running this:


In [21]:
my_meetup_dot_com_profile = {"first name": "Ignatius", "favourite number": 9,
    "favourite programming language": "FORTRAN66", 3: "is the magic number"}

This is completely unsorted. However, you can get any element back by its name (key):


In [22]:
my_meetup_dot_com_profile["last name"]


---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-22-9c5e2f7d2483> in <module>()
----> 1 my_meetup_dot_com_profile["last name"]

KeyError: 'last name'

Hmm... I'll let you fix that...

And even though it is unsorted, it can have integers or so forth as keys - note the very last entry in the definition...


In [ ]:
my_meetup_dot_com_profile[3]

...but not...


In [ ]:
my_meetup_dot_com_profile[2]

Because it doesn't exist.
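If you would rather not trip over a KeyError, a dict can be asked politely. This is only a sketch, using a cut-down, made-up copy of the profile (the name profile is mine):

```python
profile = {"first name": "Ignatius", 3: "is the magic number"}

has_three = 3 in profile     # the in operator checks the keys
has_two = 2 in profile

# get returns a fallback of your choosing instead of raising KeyError.
found = profile.get(3, "no such key")
missing = profile.get(2, "no such key")

print(has_three, has_two)
print(found, "/", missing)
```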

Extending lists and dicts

Extending lists is a little unwieldy, but dicts are more intuitive:


In [23]:
things_to_do.append("Find a nice retirement village in the Galapagos Islands")
my_meetup_dot_com_profile["Interests"] = ["Python2", "Python3", "Scientific Python", "Pottery"]
print("TODO:", things_to_do, "\n\nMEETUP:", my_meetup_dot_com_profile)


TODO: ['Learn Python', 'Finish PhD', 'Publish research', 'Accept Nobel prize', 'Inspire a new generation', 'Find a nice retirement village in the Galapagos Islands'] 

MEETUP: {3: 'is the magic number', 'first name': 'Ignatius', 'favourite programming language': 'FORTRAN66', 'Interests': ['Python2', 'Python3', 'Scientific Python', 'Pottery'], 'favourite number': 9}

Note that anything can be an item in a dictionary or list - in this case, a list is an item in our dict. Also note that, as with other languages, \n signifies a newline.

Join

The most useful method in the str object

Different languages have different ways of collapsing a list of strings into one string with a separator - it is something you will need to do again and again. In Python, the approach is a little surprising - we have already seen that strings are objects (of type str), but it turns out Python uses this to provide a way of joining iterables (like lists or dicts):


In [24]:
', '.join(things_to_do)


Out[24]:
'Learn Python, Finish PhD, Publish research, Accept Nobel prize, Inspire a new generation, Find a nice retirement village in the Galapagos Islands'

This way around, all the string needs to do is add itself before each item in the iterable (except the first) - as long as it can keep getting items, it doesn't care what the iterable is, a dict or anything.

Can you see the benefit of defining this method as part of str, being able to pass it any possible iterable to join up, instead of having to define a method on every type of iterable and passing it a string?
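To see that join really is indifferent to the kind of iterable, here is a quick sketch with a few different ones. The only thing it insists on is that every item is itself a string, hence the str(...) in the last example.

```python
separator = ', '

from_list = separator.join(['a', 'b', 'c'])
from_tuple = separator.join(('a', 'b', 'c'))
from_string = separator.join('abc')            # a string yields its characters
from_dict = separator.join({'only key': 1})    # a dict yields its keys

# Numbers must be turned into strings first; this converts them on the fly.
from_numbers = separator.join(str(n) for n in range(3))

print(from_list, '|', from_string, '|', from_dict, '|', from_numbers)
```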

Quick aside

Save early, save often

Jupyter does do some automatic saving, but click the disk icon at the top left now to force an update...

ReDebugging

Bringing it together

Looking back at that dict definition, the syntax is slightly different from a list's. Instead of brackets, we use braces {}.

Now each element has a name, followed by a colon and the element itself, which can still be basically anything. Like storage units, we now have a group of things we can address. That address could be a number, a string, or any other basic type. And keys can be different types - check the last entry.

Here's another one to see if you can fix - try running and then spot the issue:


In [ ]:
y = 120
for x in range(5):
    y = y / x
    print("When x is", x, "then y is", y)

Output should be:

When x is 1 then y is 120.0
When x is 2 then y is 60.0
When x is 3 then y is 20.0
When x is 4 then y is 5.0

Couple of bits of useful information:

  • for is, unsurprisingly, a for loop, as in other languages
  • for doesn't have limit arguments, like in C, say - it takes a set/list/(anything iterable) and goes through each element
  • range just gives you a sequence of integers (exactly what it returns is covered a little further down)
  • range can take one argument or two arguments (or more, but not so relevant now)
  • range documentation is here : Python3 range syntax (link opens in new window, so click it!)

The easiest way, IMO, to get info on any Python function or library is to Google "Python3 funcname" and click the first python.org link. If something seems wrong, make sure you are looking at the Python3, not Python2 docs (or v.v.)

If nobody has already, when you solve this (i.e. get the output above), please make a note in Etherpad. If you are still solving the puzzle - don't look at Etherpad unless you want the answer.

To see what range actually returns, you can run:


In [30]:
range(5)


Out[30]:
range(0, 5)

Well, that wasn't very descriptive. It turns out the range function returns something with (a confusingly named) type range - this is what for gets handed. Don't worry about the ins-and-outs of that type just yet - what matters is: things of type range look like, act like and sound like a list.

As far as for is concerned, that's good enough - this is the duck test, and is a key paradigm in Python - when writing code, don't require that input to be of type float, or type int, just complain if it doesn't do what you want. And range can iterate, like a list, which is all for wants.
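A minimal sketch of the duck test from the other side, i.e. when you are the one writing the code. The function count_items is made up for this example; it never asks what type it was handed, it just tries to loop.

```python
def count_items(iterable):
    # No type check: anything a for loop accepts is good enough.
    count = 0
    for item in iterable:
        count = count + 1
    return count


from_list = count_items([10, 20, 30])
from_tuple = count_items((10, 20))
from_dict = count_items({"a": 1, "b": 2, "c": 3, "d": 4})
from_range = count_items(range(5))

print(from_list, from_tuple, from_dict, from_range)
```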

To prove this, let's force it into a list and see what happens...


In [33]:
list(range(5))


Out[33]:
[0, 1, 2, 3, 4]

Essentially, for sees what we see - a sequence of five numbers.

Bear in mind that you can cast like this to various types. str will make something a string...


In [36]:
str(3.14) + " is almost pi"


Out[36]:
'3.14 is almost pi'

Unlike some languages, Python does care about type, so you must concatenate strings with strings or add numerical types to numerical types. To check the type of any number or variable, you can use the built-in, type:


In [37]:
type(str(3.14))


Out[37]:
str
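A few more casts, as a sketch (the variable names are invented for the example). The general pattern is that the type name doubles as the conversion function.

```python
number_of_tasks = 6

label = "Tasks: " + str(number_of_tasks)   # int -> str, so + can concatenate
total = int("42") + number_of_tasks        # str -> int, so + can add
half = float("0.5") * number_of_tasks      # str -> float
whole = int(3.99)                          # float -> int drops the fraction

print(label, total, half, whole)
print(type(total).__name__, type(half).__name__)
```

Leaving out the str(...) in the first line, i.e. writing "Tasks: " + number_of_tasks, raises a TypeError rather than guessing what you meant.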

A string is, well, kind of like a list of characters, right? Certainly true in C and FORTRAN. So can "for" iterate over that too?


In [38]:
d = ""
for i in range(2):
    d += "Let me hear you say "
    for c in "YMCA":
        d += c + '.'
    d += "  "
print(d)


Let me hear you say Y.M.C.A.  Let me hear you say Y.M.C.A.  

Yes, yes it can: check out the 'for c in "YMCA"'. We also snuck in a nested loop - you just keep indenting each time you nest - no more complicated than that.

And, admittedly, a new operator has appeared: "+=". Familiar to most languages it is equivalent to d = d + blah. Note that we therefore have to have d defined before the first +=, as "d = d + anything" doesn't make sense if d does not already exist. This is the reason for our first line: "d = """.

Finally, note that we have looped twice using range, just as above, but our loop variable i is never used - it is effectively a placeholder. All we want from that line is to have the body below run twice.

Sidenote: some people use an underscore "_" as a loop variable in this case, to indicate to someone reading that the variable is nothing more than that and never used - as far as Python is concerned, "_" is as good a name for a variable as any other, so this is entirely about readability.

Just to prove that "for" will iterate over any list, numeric, string or otherwise, we can try it with our todo list:


In [39]:
total_months = 0
for task in things_to_do:
    total_months += len(task)
    print(task, ":")
    print("  using Python, this task will take", len(task), "months to complete")
print("You cannot retire for at least", total_months // 12, "years")


Learn Python :
  using Python, this task will take 12 months to complete
Finish PhD :
  using Python, this task will take 10 months to complete
Publish research :
  using Python, this task will take 16 months to complete
Accept Nobel prize :
  using Python, this task will take 18 months to complete
Inspire a new generation :
  using Python, this task will take 24 months to complete
Find a nice retirement village in the Galapagos Islands :
  using Python, this task will take 55 months to complete
You cannot retire for at least 11 years

Here we have added one or two minor surprises. One is the horrendously misleading use of len (hint: you are unlikely to get a Nobel Prize by 2020), which in reality calculates the number of items in any "iterable". This is the general term for something sort-of-list-like. According to my script, tasks take as long as the number of items in their "list", i.e. characters in the string.

Another aspect is the double slash. This is an important difference between Python3 and Python2 (and many other languages). Dividing integers with one slash, as normal, gives a float in Python3, even when they divide exactly. If you want to get an integer (that is, the float answer rounded down to the nearest whole number, which for positive numbers just means chopping off the decimal), you can use double-slash. Try removing the second slash and re-running to see the exact non-integer answer. In Python2, using one slash on two integers always gives an integer, unless you cast the bottom or top to a float (as in C, say).
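The two kinds of division side by side, as a sketch under Python3 rules (135 is the total number of "months" from the to-do list above):

```python
exact = 135 / 12        # true division: always a float
floored = 135 // 12     # floor division: rounds down to a whole number
left_over = 135 % 12    # the remainder that // threw away

even = 4 / 2            # still a float, even though it divides exactly

# For negative numbers, rounding down is not the same as chopping off.
negative_floor = -7 // 2
negative_chop = int(-7 / 2)

print(exact, floored, left_over, even)
print(negative_floor, negative_chop)
```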

Modules

Gotta catch em all

Introduction to modules

  • Where Python gets its power

Wealth of modules providing everything from full GUI toolkits to astrophysics simulations. Mature and reliable numerics libraries, and an evolving ecosystem that grows day by day

  • Some modules come with Python, but you can install many more

Python has tools so you can just specify the name and it will get it from the appropriate online repository. On Windows, this can be tricky, but a recent Python distribution, Anaconda, makes it easy, and includes thousands of packages out of the box. This seems your best bet if on Windows or Mac and I have USB sticks here for you to get going at the end of the day.

  • Some are part of Python, lots are third-party

It is good to have an idea of which is which, as you can get (normally free) help, often at pretty short notice, by heading to the project's forums - remember, always be polite and, bear in mind, it isn't a commercial service, so an answer isn't guaranteed. For reference, when someone is really desperate they can offer a bounty for a solution or a bit of code on certain websites, not to mention people offering consultancy support, so that doesn't entirely mean there are no other options, but most of the time what's freely available is more than adequate.

Using modules

First off, you import a module. This tells Python that it should hoke it out of its cave of treasures for upcoming use...


In [40]:
import math

This brings in a whole set of tools for dealing with basic mathematics. (so execute it!)

We reach into this with a dot...


In [41]:
math.pi


Out[41]:
3.141592653589793

It has functions and variables...


In [43]:
math.sin(math.pi / 2)


Out[43]:
1.0

So how do you find out what is available in math? Like before, Google python3 math... try it now: https://www.google.ie/search?q=python3+math (this is just the Google search link)

For me, the closest version of Python was actually the second link (for Python 3.5) - normally, minor version differences are rarely a problem, but if what you see seems different to the manual, just Google for "python 3.4 math", or whatever. To check the current version that Jupyter is using, go to Help->About.

Another useful one is os...


In [73]:
import os
print(os.path.exists('/usr/bin/python'))


True

This is in fact using a submodule, os.path, we can reach in twice to get functions inside that.

Also sys...


In [74]:
import sys
print(sys.path)
# sys.exit(1)  # Exit with error code 1


['', '/usr/src/jupyter-notebook', '/usr/lib/python3.4', '/usr/lib/python3.4/plat-x86_64-linux-gnu', '/usr/lib/python3.4/lib-dynload', '/usr/local/lib/python3.4/dist-packages', '/usr/lib/python3/dist-packages', '/usr/local/lib/python3.4/dist-packages/IPython/extensions', '/root/.ipython']

The last (commented-out) line would tell your script to exit, with code 1. I don't really want to try that right now, as whatever happens won't be good.

There are four key variants of importing

You will see all four, so check back here


In [44]:
import math
print(math.e)


2.718281828459045

...we've seen that...


In [45]:
import math as m
print(m.e)


2.718281828459045

...in other words, give math an alias...


In [46]:
from math import e
print(e)


2.718281828459045

...which saves some typing if you're sure you don't want to use any other variable called e...


In [47]:
from math import *
print(e)


2.718281828459045

This is the most succinct as it just dumps everything in the math module into the global namespace, which saves much typing, but it can make debugging a headache, especially if there are some strange functions or variables in there that happen to have the same name as something else you later decide to use (this does happen from time to time)
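To show the sort of name clash that importing straight into the global namespace invites, here is a small sketch. The "electron" variable is invented for the example; the same thing happens, on a larger scale, with the star form.

```python
import math

e = "electron"            # a variable of our own...
from math import e        # ...silently replaced by Euler's number
clobbered = e

# Keeping the module prefix (or an alias) lets both coexist.
e = "electron"
euler = math.e

print(clobbered)
print(e, euler)
```

Python raises no error or warning at the second line, which is exactly why this can be a headache to debug.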

Pythonic

Walks like a Python...

This is a fundamental principle of Python and, unusually for something so fundamental, it isn't part of the code. Python is a bit like Ultimate Frisbee - in case you haven't had the pleasure, referees are not required, even in international competitions. This is on the basis that, if you're playing the sport, then you've decided you're there with the ethos, and the players can sort out those decisions amongst themselves and get on with the game. If you're trying to take advantage of that, then the fundamental question is, why play Ultimate Frisbee?

Good Python practice frequently diverges from what is commended in other languages. Succinctness or efficiency is not necessarily good - clarity comes before cleverness. However, using a pattern from previous experience when Python has a neater, more beautiful, more Pythonic way - that's something to try to avoid. The upshot of all this is that, when you come to what the community considers "good code", it is simple, elegant and clear.

Not-so-Pythonic way:


In [58]:
for i in range(len(things_to_do)):
    print(things_to_do[i])


Learn Python
Finish PhD
Publish research
Accept Nobel prize
Inspire a new generation
Find a nice retirement village in the Galapagos Islands

More Pythonic way:


In [60]:
for task in things_to_do:
    print(task)


Learn Python
Finish PhD
Publish research
Accept Nobel prize
Inspire a new generation
Find a nice retirement village in the Galapagos Islands

Same output, but one feels more like pseudocode, more like how you would verbally convey the idea if somebody asked.
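Sometimes you genuinely do want the position as well as the item. A sketch of how that is commonly written, using the built-in enumerate rather than range(len(...)); the shortened list tasks is just for this example.

```python
tasks = ['Learn Python', 'Finish PhD', 'Publish research']

numbered = []
# enumerate hands back (position, item) pairs; start=1 counts from one.
for position, task in enumerate(tasks, start=1):
    numbered.append(str(position) + ". " + task)

print("\n".join(numbered))
```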

If you want to get what Python is about, try this (literally):


In [61]:
import this


The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

This is one of the hardest things to get when switching to Python, particularly from compiled languages, but it is one of Python's greatest rewards. Nothing much beats opening up a script you hacked together two years ago to see zen-like transparency - no confusion, beautifully professional, and thanks to Python's design, no extra time spent writing it, just a feel for what Pythonic is.

A couple of resources...

  • PEP (Python Enhancement Proposal) 8 - code style

These are publicly accessible design documents and nearly everything new of importance passes through here. PEP8 is, in fact, a style guide; unlike many languages, there is more or less a definitive recommended style, and this spells it out. Since I discovered flake8, I have my IDE set to highlight style errors and it has reaped dividends to the moon and back - I strongly recommend you do so too, especially if we ever have to code together.

Exceptions

Are more than ordinary

In the words of one of early computing's most idiosyncratic legends, Rear Admiral Dr Grace Murray Hopper, "It is often easier to ask for forgiveness than to ask for permission". A distinguishing feature of Python is that it is built from the ground up with this maxim in mind, known as EAFP. In practice, this means preferring exceptions over tests, so using a try-except block (or, elsewhere, try-catch, etc.) instead of an if statement when checking whether you can perform an action.

Simple examples where you are more likely to use an exception in Python than another language:

  • When making a directory

Can you see why this might be useful? If you use an if to check for non-existence, there is a potential race condition: by the time you reach the body, it could be made. If you catch the exception, you know it existed exactly when you tried to make it. (Although, the recommended routine, os.makedirs, has an optional don't-complain-if-dir-exists argument, which is probably even better)

  • When testing file existence

Again with the race condition - this is in fact a security issue, as an attacker can create the file between you checking for existence and opening it for writing. If they create it, of course, then they set the permissions and can see the content regardless of your attempts to block reading.

  • Checking type

We talked a lot about duck-typing... the underlying principle is that you never reject input types as long as they work. Now imagine you have a routine with an argument x, where you want to do one thing if x is numeric type and another if it's, say, a string... if you use if and check their type, well what if this is some weird subclass of float that your routine has been sent, or the author of this type has carefully implemented all the necessary magic functions to make it quack like a float, but it's a completely unrelated class?

Push the boat out and see if it floats. Try casting to a float and if it doesn't work catch the exception. Then try casting to a string. Everything should cast to a string somehow. Now you have made checking numeric-ness a problem of float, which is infinitely more qualified to do this than you are.

  • Checking for a dictionary key

Maybe what you think is a dictionary isn't - it's something that, when you request thingy["something"] will check whether "something" is something it might dynamically add, and, if so, will gladly return the result. If you check first (if "something" in thingy:), either the answer is misleading ("no"), or what looks like an innocent if-clause is modifying your dictionary. Moreover, even for an ordinary dictionary, having an if-clause followed by a retrieval hits your dictionary scan twice - try-catch only searches once.
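Three of those cases, sketched in code. The helper names (number_or_text, lookup) and the "results" directory are made up for illustration, and everything happens inside a temporary directory so nothing is left behind. The try/except syntax itself is walked through in the next section.

```python
import os
import shutil
import tempfile


def number_or_text(value):
    # Let float decide what counts as numeric, instead of checking types.
    try:
        return float(value)
    except (TypeError, ValueError):
        return str(value)


def lookup(mapping, key, fallback):
    # One retrieval attempt; a missing key is dealt with after the fact.
    try:
        return mapping[key]
    except KeyError:
        return fallback


# A throwaway directory, so the example cleans up after itself.
scratch = tempfile.mkdtemp()
target = os.path.join(scratch, "results")

os.makedirs(target)
try:
    os.makedirs(target)             # second attempt: it already exists
    already_there = False
except FileExistsError:             # a subclass of OSError (Python 3.3+)
    already_there = True

os.makedirs(target, exist_ok=True)  # the "don't complain" argument

shutil.rmtree(scratch)

print(already_there)
print(number_or_text("3.5"), number_or_text("pi"))
print(lookup({"first name": "Ignatius"}, "last name", "unknown"))
```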

Hopefully this motivates the idea of exceptions before tests - EAFP. Why then have you been repeatedly told not to do this in, as Python calls them, LBYL (Look Before You Leap) languages? The answer is usually that exceptions are horrendously slow and inefficient. In Python this isn't true, by design. They are so fundamental to the language that every loop in fact ends, not with a failing test, but when the iterable throws a specific exception (the StopIteration exception).
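To make the StopIteration point a bit less abstract, the sketch below does by hand roughly what a for loop does behind the scenes. It is an approximation for illustration, not the interpreter's actual implementation, and it uses the built-ins iter and next.

```python
letters = iter("YMCA")      # ask the string for an iterator

seen = []
while True:
    try:
        letter = next(letters)   # fetch the next item...
    except StopIteration:        # ...until the iterator says it is finished
        break
    seen.append(letter)

print(seen)
```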

Try a try

The actual syntax is similar to what you will have seen elsewhere:


In [62]:
try:
    something_stupid()
except:
    print('Doh!')


Doh!

Well, the first thing is that we didn't define something_stupid. So we get an exception and our extremely generic except provides no useful information.

An improvement


In [63]:
try:
    something_stupid()
except Exception as e:
    print('Doh! You forgot that', e)


Doh! You forgot that name 'something_stupid' is not defined

Definitely better, and note that print will cast e, the exception, to a string - it is actually a more complex object:


In [68]:
try:
    something_stupid()
except Exception as e:
    print(', '.join(dir(e)))


__cause__, __class__, __context__, __delattr__, __dict__, __dir__, __doc__, __eq__, __format__, __ge__, __getattribute__, __gt__, __hash__, __init__, __le__, __lt__, __ne__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __setstate__, __sizeof__, __str__, __subclasshook__, __suppress_context__, __traceback__, args, with_traceback

Note in particular that we can get the entire traceback from e.__traceback__ and that e can have arguments when it is thrown - retrieved via e.args. We will worry about throwing (actually generating a new exception) later, but suppose we don't actually want to stop the exception bubbling up, just to do some logging or tidy-up on the way through.


In [69]:
try:
    something_stupid()
except Exception as e:
    print('I am *not* cleaning your mess for you, deal with it yourself!')
    raise e


I am *not* cleaning your mess for you, deal with it yourself!
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-69-52801b794bde> in <module>()
      3 except Exception as e:
      4     print('I am *not* cleaning your mess for you, deal with it yourself!')
----> 5     raise e

<ipython-input-69-52801b794bde> in <module>()
      1 try:
----> 2     something_stupid()
      3 except Exception as e:
      4     print('I am *not* cleaning your mess for you, deal with it yourself!')
      5     raise e

NameError: name 'something_stupid' is not defined

Now we get the same exception we would have got the last time, but the raise keyword has kept it moving on past. This is quite useful, as Python exceptions tend to have lots of juicy info we wouldn't want to lose by ending our except block with a boring print('oh noes') and program exit.
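Here is a sketch of the "log it and pass it on" pattern using an exception that Python raises for us (a missing dict key). The names profile and notes are invented for the example. It uses raise on its own, with no name after it, which re-raises whatever exception is currently being handled.

```python
profile = {"first name": "Ignatius"}
notes = []

try:
    try:
        profile["last name"]
    except KeyError as error:
        # Do some logging on the way through...
        notes.append(type(error).__name__)
        notes.append(error.args)
        notes.append(error.__traceback__ is not None)
        raise    # ...then let the same exception carry on upwards
except KeyError:
    notes.append("the exception reached the outer handler")

print(notes)
```

The outer try is only there so that the example finishes cleanly; in the notebook you would normally let the exception reach the top and show its traceback.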

Better yet

So, at the moment, we catch every possible exception. This is probably a very bad idea, as, usually when we catch an exception, it is because we expect a particular thing to go wrong. As a case in point:


In [76]:
try:
    os.makedirs(dirname)
except:
    # Great, someone has already created that directory
    # We can carry on!
    pass
# lalalalala...
print("just mucking about with my friend dirname")


just mucking about with my friend dirname

First note pass. This is required because every block must have at least one non-comment line - if nothing else is there, we can use pass - it is Python's no-op if that helps.

Now ask yourself, "did we ever actually define dirname?" No? Then as soon as we use it after our supposed check, we will get an unhandled NameError exception.

We can specify what type of exceptions we catch... dir-already-exists errors have type OSError.


In [77]:
try:
    os.makedirs(dirname)
except OSError:
    # Great, someone has already created that directory
    # We can carry on!
    pass
# lalalalala...
print("just mucking about with my friend dirname")


---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-77-4aa6d763f870> in <module>()
      1 try:
----> 2     os.makedirs(dirname)
      3 except OSError:
      4     # Great, someone has already created that directory
      5     # We can carry on!

NameError: name 'dirname' is not defined

Now we have stepped out of the way of genuine errors coming through. We can actually handle different exception types in different ways - there are plenty of cases where this might be useful, if you want to do something awkward which could fail in five different directions.


In [81]:
try:
    os.makedirs(dirname)
except OSError:
    # Great, someone has already created that directory
    # We can carry on!
    pass
except NameError:
    print("""
    This is the third code example
    where you haven't defined `dirname`.
    Seriously, catch yourself on.
    """)
# lalalalala...
print("just mucking about with my friend dirname")


    This is the third code example
    where you haven't defined `dirname`.
    Seriously, catch yourself on.
    
just mucking about with my friend dirname

Yeah. Great.

Useful tool just slotted in there: multi-line strings. If you start a string with three quotes, you can keep going on and on until you hit another three.
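One thing worth seeing for yourself: a triple-quoted string keeps every newline and every leading space, which is why the message above came out indented. A small sketch, assuming you would rather tidy that up, using textwrap.dedent from the standard library (the variable names are just for illustration):

```python
import textwrap

message = """
    This is line one
    and this is line two.
    """

# repr shows the newlines and leading spaces that are really in the string
print(repr(message))

# dedent removes the common leading whitespace; strip trims the blank ends
tidy = textwrap.dedent(message).strip()
print(tidy)
```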

One final example to show how, for this particular case, you really can be a little more specific.


In [82]:
dirname = "/etc/passwd"
try:
    os.makedirs(dirname)
except OSError:
    # Great, someone has already created that directory
    # We can carry on!
    pass
# lalalalala...
print("just mucking about with my friend", dirname)


just mucking about with my friend /etc/passwd

Emm...we tried to make a directory with the same name as a key system file and are blithely assuming that every failure in doing so is simply because it is an existing directory. Not so good.


In [103]:
import errno

dirname = "/etc/passwd"

try:
    os.makedirs(dirname)
except OSError as e:
    if e.errno != errno.EEXIST:
        raise
    # Great, someone has already created that directory
    # We can carry on!
# lalalalala...
print("just mucking about with my friend", dirname)


just mucking about with my friend /etc/passwd

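This emphasises the fact that subclasses of Exception often have additional contextually-relevant properties, which you should use. It also points out that if statements still have their place in error-handling! Note, though, that the operating system reports EEXIST whenever the path already exists, even as an ordinary file like /etc/passwd, which is why this cell still carries on; any other failure (permissions, say) is now re-raised.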

When raise has no argument, it re-raises whichever exception it was that got us into this mess.
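A hedged alternative for this particular job, assuming Python 3.2 or later: os.makedirs takes an exist_ok argument, which stays quiet only when the path is already a directory and still raises if something else is sitting there. The sketch below works in a temporary directory rather than near /etc, and ensure_directory is a hypothetical helper name.

```python
import os
import tempfile


def ensure_directory(path):
    # exist_ok=True suppresses the error only if path is already a directory
    os.makedirs(path, exist_ok=True)


with tempfile.TemporaryDirectory() as scratch:
    new_dir = os.path.join(scratch, "results")
    ensure_directory(new_dir)
    ensure_directory(new_dir)  # second call is harmless
    created = os.path.isdir(new_dir)

    # A plain file standing in for something like /etc/passwd
    plain_file = os.path.join(scratch, "not_a_dir")
    with open(plain_file, "w") as f:
        f.write("important contents")

    try:
        ensure_directory(plain_file)
        refused = False
    except FileExistsError:
        # The path exists but is not a directory, so we are told about it
        refused = True

print("directory created:", created)
print("refused to treat a file as a directory:", refused)
```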

A few important Exceptions

That you might want to catch

  • KeyError - box_of_tricks["not here"]
  • TypeError - 1 + "banana"
  • IOError (an alias of OSError in Python 3) - open('/etc/passwd', 'w')

Exceptions are often more pythonic than pre-checking. This Python2 doc is a good start on that road. CRUCIAL READING!
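To make that concrete, here is a minimal sketch of the two styles side by side, using a made-up box_of_tricks dictionary. Neither is wrong; the second just lets Python tell us when the key is missing instead of asking first.

```python
box_of_tricks = {"rabbit": "hat"}

# Pre-checking: look before you leap
if "dove" in box_of_tricks:
    hiding_place = box_of_tricks["dove"]
else:
    hiding_place = "nowhere"

# Exception-based: try it, and handle the one failure we expect
try:
    hiding_place_again = box_of_tricks["dove"]
except KeyError:
    hiding_place_again = "nowhere"

print(hiding_place, hiding_place_again)
```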

We are near the end of the walk-through

Feel free to go back and forward through this. I am going to give you one final task to get you started on this topic...

Remember your line number from Etherpad? If not, it should be in variable x...


In [48]:
print("I am on line number", x)


I am on line number 1

But double-check Etherpad...

Remember the post-its? So, everybody stick your star post-it somewhere I can see it...

Your task is to write a Jupyter cell to give you the $x^{th}$ digit of $\pi$ and write the digit at the start of that Etherpad line, the one with your name on it. In particular, try and do this, just using x, without typing the number in directly. Then you can experiment to see if other numbers work too.

There are several ways to do this... I've given you enough tools above to find at least one. If, when you get it working, no-one has described your approach on Etherpad, write a short description at the bottom of the notepad.

When done, swap your star for an arrow.


In [56]:
# Overwrite this cell with your code and run it

So, hopefully, you now have a calculator for the $x^{th}$ digit of $\pi$. How can you use this without going back and editing the cell?

Well... copy and paste your text below the line in the cell below:


In [ ]:
def get_xth_digit_of_pi(x):

Now indent everything beneath the def line by four spaces and run it. You have just defined a function! If it worked, this should give you a 1:


In [57]:
get_xth_digit_of_pi(4)


---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-57-f6b21e9ee432> in <module>()
----> 1 get_xth_digit_of_pi(4)

NameError: name 'get_xth_digit_of_pi' is not defined

If you had errors that don't make sense, try the chat window, try someone beside you (my normal approach in life) or ask me. While you are doing that, I am going to wrap up...

What about all the other structures??

Well, they are not so hard once you get this far:

but you'll come across them throughout today

  • Loops can be while [SOME BOOLEAN]:, as well as for
  • We will explore some actual modules after lunch
  • List comprehensions (and their lazy cousins, generator expressions) put loops inside lists

I'll let that boggle your mind for a moment. Our future sessions are not going to be so closely led, partially because it cannot be as interesting to experiment with my code as it is getting your hands dirty with your own. Now you've got plenty of tools at your disposal.
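For a quick taste of the loop bullets above, here is a small sketch with made-up variable names: a while loop, a loop written inside a list (a list comprehension), and the lazy version with round brackets (a generator expression), which only produces values as something asks for them.

```python
# A while loop runs for as long as its condition is True
countdown = 3
launch_log = []
while countdown > 0:
    launch_log.append(countdown)
    countdown -= 1

# A loop inside a list: builds the whole list in one go
squares = [n * n for n in range(5)]

# Same loop in round brackets: values are produced one at a time, on demand
lazy_squares = (n * n for n in range(5))
total = sum(lazy_squares)

print(launch_log, squares, total)
```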

Now, I'm going to do a quick tour of scripting applications of Python, a few ideas you might want to ask me about over lunch, to give you an idea of what Python can do, so sit back, relax and let me fiddle with projector technology at the front for a couple of minutes...

Finally...

After a whistle-stop tour, we pull into a Python siding...

  • Please write one great thing about this morning on your arrow and leave it beside the door
  • Please write one dire (or just not so good) thing about this morning on your star and leave it beside the door

PS: Tuples

The Ice-List

Tuples are basically frozen lists...


In [25]:
target_coordinates = (56, -5)
print(target_coordinates[0], "N", target_coordinates[1], "E")


56 N -5 E

...but it's fixed, so you can't do this...


In [26]:
target_coordinates[1] = 38


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-26-6b84172f9a38> in <module>()
----> 1 target_coordinates[1] = 38

TypeError: 'tuple' object does not support item assignment

If you alter it to set the whole variable (try it), that's fine though - you can alter what target_coordinates points to, you just can't alter the tuple itself.

Lists are constantly changing - length, content, type of content. Really, they aren't much like arrays in compiled languages like C or FORTRAN. They are far too flighty to be used as something like a dictionary index, for example. However, suppose I have a pair - exactly two elements, one a known int and one a known string, and it can't change. Well, if you stick two basic types together, then why can't you do the same things with the pair that you can do with basic types? There's no weird changing behaviour going on, so if you freeze a list into something like that, can you use it as a dictionary index?

In short - yes you can!


In [28]:
battleships = {}  # new dict
coordinates = [3, 2]
battleships[coordinates] = "HIT"

So it doesn't work with a list - try it now with a tuple - swap the square brackets on the second line for parens () and re-run


In [29]:
print("At", coordinates, "we have a", battleships[coordinates])


At (3, 2) we have a HIT

Aside: A hashable type can be used as an index in a dict - tuples are (provided everything inside them is hashable too), lists aren't (because they can change, aka are mutable)
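If you want to poke at hashability directly, here is a small sketch using the built-in hash function. It catches the TypeError so the cell runs to the end; the exact wording of the error message varies between Python versions, so it is printed rather than relied upon.

```python
battleships = {}

# A list as a key is refused
try:
    battleships[[3, 2]] = "HIT"
except TypeError as e:
    list_error = str(e)
    print("List as key failed:", list_error)

# A tuple of ints is fine
battleships[(3, 2)] = "HIT"

# A tuple is only hashable if everything inside it is hashable too
try:
    hash((3, [2]))
    nested_ok = True
except TypeError:
    nested_ok = False

print(battleships, nested_ok)
```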

PS: an introduction to objects...

Objects

Totally class

If you haven't done object oriented coding before, I am afraid I am not going to give you the magic bullet now. It is an important concept to get to grips with, but for the moment, we will keep it fairly functional. Do take time to look through it later - Python is an excellent language for playing around with it and getting an intuition for what an object is. A number of you will be very familiar with this so please bear with me and the very rough introductory description I will give - for those who are not so familiar, this is worth getting the basics down early.

In real life, everything has properties, and many things have things they can do. In Python these properties are called attributes and the things an object can do are called methods. Together these are called members and, in Python, are accessed by putting a dot after the object, followed by the attribute or method name. Apart from that, they behave just like any other variable or function, respectively.

For instance, my dog Freddie has properties - he has a colour, which is black (technically, he is imaginary, but in my head he is definitely black). In Python terms, this would be:

Attribute: freddie.colour = black

I can tell Freddie to roll-over. When I do so, I am, in some sense, calling Freddie's method:

Method: freddie.roll_over()

I have another dog, Nitwit. Nitwit can also roll over...

Method: nitwit.roll_over()

Both dogs can shake hands, but I need to tell them which paw...

Method: freddie.shake_paw('left')

At this point, you're probably wondering what exactly is the set of attributes and methods that my dogs have? This template, showing what attributes and methods a dog of mine can be expected to have, is called a class.

A simple class


In [6]:
class PhilsDog:
    name = ""
    colour = ""
    def shake_paw(self, side):
        print("My name is", self.name, "and I am shaking my", side, "paw like a good dog")

Here we create a class called PhilsDog that all of my dogs implement (that is, they are of that type). It indicates that they will have a name and a colour and that they have a method called shake_paw. Think of this from the perspective of the dog - the first argument, self, is a little Python magic that refers to the dog itself. This lets the dog use its name and colour (and any other methods) in the shake_paw method. The second argument is which paw I told my dog to shake. In response, any dog of mine says "Shaking left/right paw like a good dog". That's quite impressive, but I'd rather they just learned to zip it and actually shake their paws instead.

Now, that's just the template for one of my dogs, so how do I use it?


In [7]:
freddie = PhilsDog()
freddie.name = "Freddie"
freddie.colour = "black"

nitwit = PhilsDog()
nitwit.name = "Nitwit"
nitwit.colour = "brown"

print("Freddie is", freddie.colour, "while Nitwit is", nitwit.colour)


Freddie is black while Nitwit is brown

I call the class like a function. This creates a new PhilsDog object - in computing terminology, I instantiate the class, creating a new instance of PhilsDog. Freddie is an instance of PhilsDog and so is Nitwit. But PhilsDog is just a template - as you can see, both Freddie and Nitwit have their own colour and name. I can update these just as any variable, and Freddie's doesn't affect Nitwit's, and I can read both back out again.


In [8]:
freddie.shake_paw('left')
nitwit.shake_paw('right')


My name is Freddie and I am shaking my left paw like a good dog
My name is Nitwit and I am shaking my right paw like a good dog

Good dogs. Here we see the shake_paw method being called for each dog and with a different side parameter. To remind you, the body of the method was:

def shake_paw(self, side):
    print("My name is", self.name, "and I am shaking my", side, "paw like a good dog")

You can see how the self.name matches the name I gave each dog on the previous slide.
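For those who want to go one step further, a common variation (not the version used above, just a sketch) is to give the class an __init__ method, which Python calls when the object is created, so each dog gets its name and colour up front. The class name PhilsDogWithInit and the dogs Biscuit and Pepper are invented for this example.

```python
class PhilsDogWithInit:
    def __init__(self, name, colour):
        # Runs when PhilsDogWithInit(...) is called; self is the new dog
        self.name = name
        self.colour = colour

    def shake_paw(self, side):
        print("My name is", self.name, "and I am shaking my", side, "paw like a good dog")


biscuit = PhilsDogWithInit("Biscuit", "golden")
pepper = PhilsDogWithInit("Pepper", "grey")

# Changing one dog's attribute leaves the other dog alone
pepper.colour = "white"

biscuit.shake_paw("left")
print(biscuit.colour, pepper.colour)
```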

If that was familiar to you, then that was probably a very unexciting few minutes - if not, it's probably come and gone very quickly. Thankfully, we only need to know that, if we have an object, we can get its attributes and methods by adding a dot and the attribute/method name.

Now why is that important...

In Python

Everything is an object (nearly)


In [9]:
"just a normal string".upper()


Out[9]:
'JUST A NORMAL STRING'

In [10]:
(3 + 2j).imag


Out[10]:
2.0

In [11]:
"Another string".islower()


Out[11]:
False

Two useful tools


In [12]:
type(freddie)


Out[12]:
__main__.PhilsDog

"type" lets you examine what class an object is. Don't worry about that __main__ just for the moment.


In [13]:
dir(freddie)[-3:]


Out[13]:
['colour', 'name', 'shake_paw']

This is a list of all the members of freddie. If he hasn't had any added on the fly (which is possible in Python), this is the same as the members of the PhilsDog class. That syntax in the brackets is coming up in a couple of slides, but basically it means the last three items. Python denotes somewhat magic functions with double-underscores on either side (they are all the previous items I'm hiding away) - there is good reason for these being here, but they aren't essential just now. Try removing the bit in brackets, including the brackets, to see what you get.
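To see the "added on the fly" point in action, here is a minimal sketch. It repeats the class so the cell stands alone, uses a new dog called puppy so as not to disturb freddie, and invents an attribute favourite_toy. Instead of slicing, it filters out the double-underscore names.

```python
class PhilsDog:
    name = ""
    colour = ""

    def shake_paw(self, side):
        print("My name is", self.name, "and I am shaking my", side, "paw like a good dog")


puppy = PhilsDog()
# Hypothetical attribute, added to this one dog only
puppy.favourite_toy = "squeaky bone"

# Keep only the members without double-underscores at the front
puppy_members = [m for m in dir(puppy) if not m.startswith("__")]
class_members = [m for m in dir(PhilsDog) if not m.startswith("__")]

print(puppy_members)
print(class_members)
print(isinstance(puppy, PhilsDog))
```

The instance picks up the extra member, while the class template is unchanged.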